CLAIMS 

What is claimed is: 



1. A method of providing information about a product, the product available for
purchase from a plurality of sources, the method comprising:
    receiving a selection of a product category from a predefined set of product
        categories using information about the product;
    accessing a list of extraction parameters for the product category;
    receiving a selection of at least one extraction parameter in the list of extraction
        parameters;
    for each of the plurality of sources, creating a corresponding program including
        identifying a corresponding web site, the corresponding site selling the
            product and
        providing a tool for creating the corresponding program to extract data from
            the web site using the at least one extraction parameter;
    receiving a connection from a client, the connection including a value for the at
        least one extraction parameter; and
    for each of the plurality of sources in the product category, providing product
        information for the product using the value for the at least one extraction
        parameter and the corresponding program.

2. The method of claim 1, wherein the providing the tool for creating the
corresponding program to extract data from the corresponding web site using the at least
one extraction parameter further comprises providing a graphical user interface tool for
developing the corresponding program, the graphical user interface tool including a web
browser.

3. The method of claim 2, wherein the graphical user interface tool further includes a
first tool, the first tool for developing an extraction pattern, the extraction pattern
identifying a plurality of portions of a document on the corresponding web site.



4. The method of claim 3, wherein the graphical user interface tool further comprises:
    receiving a selection signal;
    applying the extraction pattern to find a matching pattern in a document displayed
        in a source view in the web browser;
    displaying a rendered version of the matching pattern in a window.

5. The method of claim 3, wherein the graphical user interface tool further includes a
plurality of predefined extraction patterns.

6. The method of claim 5, wherein the plurality of predefined extraction patterns
include at least one of an extraction pattern for matching a hyperlink, an extraction pattern
for matching a form, and an extraction pattern for matching a price.
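
By way of illustration only, the predefined extraction patterns recited in claim 6 could be sketched as follows. The claims do not specify a pattern syntax, so regular expressions are an assumption here, and the names `PREDEFINED_PATTERNS`, `find_all`, and `SAMPLE` are hypothetical.

```python
import re

# Hypothetical predefined extraction patterns. The claims do not name a
# pattern syntax; regular expressions are assumed purely for illustration.
PREDEFINED_PATTERNS = {
    "hyperlink": re.compile(r'<a\s[^>]*href="([^"]*)"', re.IGNORECASE),
    "form": re.compile(r'<form\s[^>]*action="([^"]*)"', re.IGNORECASE),
    "price": re.compile(r'\$\s?(\d+(?:,\d{3})*(?:\.\d{2})?)'),
}


def find_all(pattern_name, document):
    """Return every match of the named predefined pattern in the document."""
    return PREDEFINED_PATTERNS[pattern_name].findall(document)


# Made-up page fragment used only to exercise the patterns.
SAMPLE = (
    '<form method="get" action="/search"><input name="q"></form>'
    '<a href="/item/42">Widget</a> <b>$1,299.00</b>'
)

links = find_all("hyperlink", SAMPLE)
forms = find_all("form", SAMPLE)
prices = find_all("price", SAMPLE)
```

A production tool would more likely parse the HTML than match it with regular expressions; the sketch only shows one pattern per category named in the claim.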

7. The method of claim 3, wherein the graphical user interface tool further comprises:
    identifying a form on the document on the corresponding web site;
    creating a step in the corresponding program, the step to submit the form without
        retrieving the document;
    generating a plurality of parameters associated with the step, the plurality of
        parameters corresponding to inputs in the form; and
    identifying at least one of the plurality of parameters with the at least one
        extraction parameter.

8. The method of claim 1, wherein the providing the tool for creating the
corresponding program to extract data from the corresponding web site using the at least
one extraction parameter further comprises defining a plurality of steps wherein at least
one step in the plurality of steps interacts with the corresponding web site and operates on
the results of the interaction.

9. The method of claim 8, wherein the defining the plurality of steps comprises for
each of the plurality of steps, receiving a selection of an extraction command from a
predetermined list of extraction commands.

10. The method of claim 9, wherein the predetermined list of extraction commands
includes extraction commands for retrieving multiple matches of an extraction pattern
from a document.

11. The method of claim 9, wherein the predetermined list of extraction commands
includes extraction commands for extracting data from a first document and a second
document, the first document including a reference to the second document.

12. The method of claim 9, wherein at least one step in the plurality of steps includes a
test condition comprising a logical test for at least one corresponding argument and a first
step in the plurality of steps, and wherein the program continues executing at the first step
if the logical test is satisfied.
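
As a non-authoritative sketch of the control flow recited in claim 12, a step can carry a logical test and a target step at which execution continues when the test is satisfied. The `Step` class, the `run` function, and the paging example are hypothetical and not taken from the specification.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Step:
    """One step of a hypothetical extraction program."""
    name: str
    action: Callable[[dict], None]                 # operates on shared state
    test: Optional[Callable[[dict], bool]] = None  # logical test over the state
    goto: Optional[int] = None                     # index of the step to continue at


def run(steps, state, max_steps=100):
    """Execute steps in order; continue at step.goto when step.test is satisfied."""
    pc, executed = 0, []
    while pc < len(steps) and len(executed) < max_steps:
        step = steps[pc]
        step.action(state)
        executed.append(step.name)
        if step.test is not None and step.test(state):
            pc = step.goto      # test satisfied: continue at the named step
        else:
            pc += 1             # otherwise fall through to the next step
    return executed


def fetch(state):
    """Stand-in for retrieving the next result page."""
    state["page"] += 1


def extract(state):
    """Stand-in for extracting data from the current page."""
    state["items"].append(f"item-from-page-{state['page']}")


# Loop back to "fetch" while fewer than three pages have been processed.
program = [
    Step("fetch", fetch),
    Step("extract", extract, test=lambda s: s["page"] < 3, goto=0),
]
```

The `max_steps` guard is a design choice of the sketch, added so that a test that is always satisfied cannot loop forever.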

13. The method of claim 12, wherein the at least one corresponding argument includes
an extraction pattern.



14. The method of claim 12, wherein the test condition further comprises a result code,
wherein the program returns an error if the result code is a web site changed result code.

15. The method of claim 12, wherein the test condition further comprises a result code,
wherein the program returns an error if the result code is a no matching product result
code.

16. The method of claim 9, wherein the predetermined list of extraction commands
includes extraction commands for segmenting a document into a plurality of units, each of
the plurality of units matching an extraction pattern.

17. The method of claim 16, wherein at least one step in the plurality of steps uses an
extraction command to segment a document into a plurality of units, and wherein the step
further includes a test condition, the test condition comprising a logical test and at least
one argument, and wherein for each of the plurality of units, the logical test is computed
with the at least one argument, and the unit is removed from the plurality of units if the
logical test is not satisfied with the at least one argument.
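
The segment-then-filter behavior of claims 16 and 17 might look like the sketch below. It assumes, for illustration only, that units are table rows matched by a regular expression and that the logical test is a case-insensitive substring check; `segment`, `filter_units`, `contains`, and `DOCUMENT` are hypothetical names.

```python
import re


def segment(document, unit_pattern):
    """Split a document into units, each unit matching unit_pattern."""
    return re.findall(unit_pattern, document, flags=re.DOTALL)


def filter_units(units, logical_test, argument):
    """Remove every unit for which logical_test(unit, argument) is not satisfied."""
    return [unit for unit in units if logical_test(unit, argument)]


def contains(unit, argument):
    """Hypothetical logical test: the unit contains the argument text."""
    return argument.lower() in unit.lower()


# Made-up result listing with one table row per product.
DOCUMENT = (
    "<tr><td>Blue Widget</td><td>$10.00</td></tr>"
    "<tr><td>Red Gadget</td><td>$12.50</td></tr>"
    "<tr><td>Green Widget</td><td>$11.25</td></tr>"
)

units = segment(DOCUMENT, r"<tr>.*?</tr>")
matching = filter_units(units, contains, "widget")
```

Here the argument `"widget"` plays the role of the value supplied for an extraction parameter; rows that fail the test are dropped before any product information is returned.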

18. An apparatus for providing information about a product, the product available for
purchase from a plurality of sources, the apparatus comprising:
    means for receiving a selection of a product category from a predefined set of
        product categories using information about the product;
    means for accessing a list of extraction parameters for the product category;
    means for receiving a selection of at least one extraction parameter in the list of
        extraction parameters;
    means for creating a corresponding program for each of the plurality of sources,
        the means for creating a corresponding program for each of the plurality of
        sources including
        means for identifying a corresponding web site, the corresponding web site
            selling the product and
        means for creating the corresponding program to extract data from the web site
            using the at least one extraction parameter;
    means for receiving a connection from a client, the connection including a value
        for the at least one extraction parameter;
    means for providing product information for the product from each of the plurality
        of sources using the value for the at least one extraction parameter and the
        corresponding program.



19. The apparatus of claim 18, wherein the means for creating a corresponding
program to extract data from the web site includes means for selecting an instruction from
a predetermined list of instructions.

20. The apparatus of claim 18, wherein the means for creating a corresponding
program to extract data from the web site includes means for developing an extractor
pattern in a web browser.



21. A computer data signal embodied in a carrier wave comprising:
    a computer program for developing descriptions of data of interest
    a set of instructions for developing an extractor pattern interactively in a web
        page;
    a set of instructions for receiving a selection of an instruction from a predefined
        set of instructions for inclusion of the instruction in the description of data
        of interest;
    a set of instructions for associating the extractor pattern with the instruction;
        and
    a set of instructions for testing the instruction using the extractor pattern and
        the contents of a buffer.

22. An apparatus comprising a computer, the computer comprising a processor and a
memory, the memory including a plurality of descriptions of data of interest, the processor
running a program, the program accepting an input and generating an output, the input
identifying a subset of the plurality of descriptions of data of interest and a plurality of
values for a plurality of extraction parameters, the output including data of interest
retrieved from a plurality of web sites corresponding to data of interest matching the
plurality of values for the plurality of extraction parameters at each of the plurality of web
sites corresponding to the subset of plurality of descriptions of data of interest.